---
title: Data Model
---
<h2>{{ page.title }}</h2>

<p>
  A series is identified by <em>key</em>, and a unique set of <em>tags</em> and <em>resource identifiers</em>.
</p>

<p>
<b>Tags</b> are indexed data stored with a given time-series. Tags can be used with both filters and aggregations.
</p>

<p>
<b>Resource Identifiers</b> are additional, non-indexed data stored with a given time-series. These were added to help address issues with ephemerality. In a system where "host" (for example) can change often, one may want to store "host" as a resource tag to avoid losing time-series data every time "host" changes. 
Currently, Resource Identifiers can be used with aggregations but not filters. 
</p>

<p>
  Keys, tag keys, and tag values can contain any valid unicode string,
  internally they are stored in UTF-8.
</p>

<p>
  A series represents <em>something</em> over time, where <em>something</em> is
  currently a set of data points.
</p>

<h3 id="data-points">
  Points
  <a class="link-to" href="#data-points"><span class="glyphicon glyphicon-link"></span></a>
</h3>

<p>
<a href="https://github.com/spotify/heroic/blob/master/heroic-component/src/main/java/com/spotify/heroic/metric/Point.kt">com.spotify.heroic.metric.Point</a>
</p>

<p>
  Each data point stores the timestamp at which they were sampled, and the
  value which they carry.
</p>

<p>
  The <em>timestamp</em> is stored as a 64-bit number (long), which
  represents the number of milliseconds since the unix epoch.
</p>

<p>
  The <em>value</em> is stored as a 64-bit floating point number (double).
</p>

<p>
  A data point is then typically represented as a JSON array, with two elements.
</p>

<pre><code class="language-json">
[&lt;timestamp&gt;, &lt;value&gt;]
</pre></code>

<h3 id="series">
  Semantic Series
  <a class="link-to" href="#series"><span class="glyphicon glyphicon-link"></span></a>
</h3>

<p>
  We strongly encourage the concept of semantic series.
</p>

<p>
  The idea behind semantic series is to move away from obscure identifiers and
  introduce metrics that are structured in a way that makes it easier for a
  human and a computer to reason about.
</p>

<p>
  So a series like the idle cpu utilization for a host could be identified as the
  following.
</p>

<pre><code class="language-json">
{
  "key": "system",
  "tags": {
    "site": "gew",
    "what": "cpu-idle-percentage",
    "system-component": "cpu",
    "cpu-type": "idle",
    "unit": "%"
  },
  "resource": {
    "podname": "pod-example-123-abc",
    "host": "database.example.com"
  }
}
</pre></code>

<p>
  This can also be represented in a more compact, human readable format as
  below.
</p>

<pre><code class="language-ts">
system { host=database.example.com, site=lon, what=cpu-idle-percentage, ... }
</pre></code>

<p>
  The need for semantic metrics becomes more apparent when you start to reason
  about <em>how to model</em> series for certain use cases using a traditional,
  hierarchical model.
</p>

<p>
  Assume that the above series was stored in a hierarchical time series
  database like the following.
</p>

<pre><code>database.example.com.system.cpu-idle-percentage</code></pre>

<ul>
  <li>
    The lack of <em>keys</em> makes deciphering a hierarchy challenging.
  </li>
  <li>
    The growth in the number of branches in the hierarchy becomes an
    organizational burden.
  </li>
  <li>
    Growth in the number of series limits discovery.
  </li>
  <li>
    The structure of the hierarchy determines <em>how</em> things are discovered.
  </li>
  <li>
    The filtering, or selecting of series is limited (e.g. wildcard).
  </li>
</ul>

<p>
  By promoting the use of tags, and a convention over which tags should be used
  <em>how</em>, the problem becomes more manageable.
</p>

<p>
  Instead of a strict hierarchy where discovery and expression is limited, you
  can have a multi-dimensional system that enables strong correlations and
  natural groupings.
</p>

<p>
  Conversely, if a given convention is followed, an administrator learning what
  a specific tag like <code>what</code> <em>means</em> will find it easier to
  navigate unknown contexts where that tag is used.
</p>

<h3>References</h3>

<ul>
  <li><a href="http://metrics20.org">Metrics 2.0 "An emerging set of conventions, standards and concepts around timeseries metrics metadata" by Dieter Plaetinck</a></li>
</ul>
